Collective Framework and Performance Optimizations to Open MPI for Cray XT Platforms
Authors
Abstract
The performance and scalability of collective operations play a key role in the performance and scalability of many scientific applications. Within the Open MPI code base we have developed a general-purpose hierarchical collective operations framework called Cheetah and applied it at large scale on the Oak Ridge Leadership Computing Facility's (OLCF) Jaguar platform, obtaining better performance and scalability than the native MPI implementation. This paper discusses Cheetah's design and implementation, and optimizations to the framework for Cray XT5 platforms. Our results show that Cheetah's Broadcast and Barrier perform better than the native MPI implementation. For medium-sized data, Cheetah's Broadcast outperforms the native MPI implementation by 93% at 49,152 processes. For small and large data, it outperforms the native MPI implementation by 10% and 9%, respectively, at 24,576 processes. Cheetah's Barrier performs 10% better than the native MPI implementation at 12,288 processes.
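The core idea behind a hierarchical collective framework is to split a flat operation into stages that match the machine's topology, e.g. an inter-node stage among per-node leader ranks followed by an intra-node stage over shared memory. The sketch below is a hypothetical simulation of that two-level broadcast pattern; the function name and parameters are illustrative and not taken from the Cheetah implementation.

```python
# Hypothetical sketch of a two-level hierarchical broadcast, the general
# pattern used by hierarchical collective frameworks. This simulates the
# data flow on a single machine; a real implementation would use MPI
# sub-communicators (inter-node and intra-node) for each level.

def hierarchical_bcast(value, nprocs, procs_per_node):
    """Simulate root -> node leaders -> node-local ranks.

    Returns the buffer each simulated rank ends up with."""
    buf = [None] * nprocs
    buf[0] = value  # rank 0 is the global root
    leaders = range(0, nprocs, procs_per_node)
    # Level 1: inter-node broadcast among the node-leader ranks
    # (crosses the network once per node instead of once per process).
    for leader in leaders:
        buf[leader] = buf[0]
    # Level 2: intra-node broadcast from each leader to its local ranks
    # (on real hardware this stage typically uses shared memory).
    for leader in leaders:
        for rank in range(leader, min(leader + procs_per_node, nprocs)):
            buf[rank] = buf[leader]
    return buf

result = hierarchical_bcast(42, nprocs=16, procs_per_node=4)
assert all(v == 42 for v in result)
```

The benefit at scale comes from the first stage touching the network only once per node, while the second stage runs concurrently on every node over its fast local interconnect.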
Similar Papers
An Evaluation of Open MPI's Matching Transport Layer on the Cray XT
Open MPI was initially designed to support a wide variety of high-performance networks and network programming interfaces. Recently, Open MPI was enhanced to support networks that have full support for MPI matching semantics. Previous Open MPI efforts focused on networks that require the MPI library to manage message matching, which is sub-optimal for some networks that inherently support match...
A Comparison of Application Performance Using Open MPI and Cray MPI
Open MPI is the result of an active international Open-Source collaboration of Industry, National Laboratories, and Academia. This implementation is becoming the production MPI implementation at many sites, including some of DOE’s largest Linux production systems. This paper presents the results of a study comparing the application performance of VH-1, GTC, the Parallel Ocean Program, and S3D o...
Implementation of Open MPI on the Cray XT3
The Open MPI implementation provides a high-performance MPI-2 implementation for a wide variety of platforms. Open MPI has recently been ported to the Cray XT3 platform. This paper discusses the challenges of porting and describes important implementation decisions. A comparison of performance results between Open MPI and the Cray-supported implementation of MPICH2 is also presented.
Open MPI for Cray XE/XK Systems
Open MPI provides an implementation of the MPI standard supporting communication over a range of high-performance network interfaces. Recently, Oak Ridge National Laboratory (ORNL) and Los Alamos National Laboratory (LANL) collaborated on creating a port of Open MPI for Gemini, the network interface for Cray XE and XK systems. In this paper, we present our design and implementation of Open MPI's...
Optimizing MPI Collectives for X1
Traditionally, MPI collective operations have been based on point-to-point messages, with possible optimizations for system topologies and communication protocols. The Cray X1 scatter/gather hardware and shared-memory mapping features allow for significantly different approaches to MPI collectives, leading to substantial performance gains over standard methods, especially for short message lengths...
Publication date: 2011